1. INTRODUCTION

Today, prominent public and private areas are monitored with surveillance cameras because of increased
security concerns. Monitoring pedestrian behavior through human supervision alone is difficult: detecting and
separating suspicious activities in real-time video is laborious and time-consuming. These limitations call
for an intelligent video surveillance system that can automatically recognize suspicious behavior in real
time. In recent years, many researchers and practitioners in computer vision and video analysis have worked
on recognizing human movement and behavior in video frames [1]-[3], with increasing attention to the
detection of suspicious activity in high-density areas. Traditional systems cannot detect and track
pedestrians in high-density areas because of full or partial occlusion of objects, changes in object size,
changes in ambient lighting, and other factors. Many authors have attempted to detect abnormal behavior in
crowded environments using texture-based information, such as temporal gradients [4], dynamic texture
characteristics [5], and spatiotemporal frequency properties [6], [7]. Other groups concentrate on optical
flow, which captures motion features in video frames directly, such as multi-scale pedestrian features [8],
fuzzy clustering-based features [9], a behavioral model for pedestrian detection [10], convolutional neural
network (CNN) features [11], weighted autoencoder-based features [12], trajectory-based features [13],
student object behavioral features [14], and multi-target association-based features [15], [16]. Previous
research has shown that motion-based techniques are beneficial, and we believe that the present methods can
still be improved. It is essential to capture information on objects of different sizes, motion direction,
speed, and inter-frame interactions; analyzing this motion information can increase the performance of the
proposed method. In the proposed motion pattern-based method, we distinguish moving pedestrians using motion
information, orientation, pedestrian size, and interaction within the video sequence, specifically in highly
dense regions. Figure 1 illustrates examples of suspicious student behavior. Figure 1(a) depicts an unusual
activity in which a student steals the mobile phone of another student. Figure 1(b) depicts a dispute between
students in the lab. A small active segment of an anomalous area is classified as a local area, whereas a
global region refers to the area where strange behavior is observed. Various ways to identify unusual
activity in a region have been introduced in the literature.


Figure 1. Student suspicious behavior examples: (a) a student steals the mobile phone of another student
(specific area) and (b) a student dispute in the lab (full frame)


Li et al. [15] presented a social force map-based strategy for detecting global anomalous
activity. They placed a particle grid across the optical flow field and computed the interaction force between
the particles. Tufek and Ozkaya [11] described a method for detecting abnormal local motions in a scene. To
detect local anomalies, they created a saliency feature map from optical flow features at various scales. A
real-time monitoring system requires a comprehensive framework for detecting abnormal activity. The
contributions of this work are as follows: i) we propose a pixel-level approach for detecting behavioral
patterns in an academic environment; and ii) we assess the effectiveness of the proposed motion-based method
on our proposed database and on other benchmark datasets, such as the University of Minnesota anomaly
detection datasets. The rest of the paper is organized as follows. Section 2 outlines existing work in this
area. Section 3 presents the proposed motion feature-based suspicious activity detection. Section 4 covers a
new pedestrian behavior database developed in an academic setting, together with results and a performance
comparison with other relevant methods. The last section concludes with directions for future research.
2. LITERATURE SURVEY

Unusual activity detection has recently attracted the interest of researchers working on smart video
surveillance systems. Gaddigoudar et al. [17] describe the difficulty of understanding and modelling behavior
in surveillance video. In an unsupervised learning framework, violations are discovered using likelihood
ratio analysis and categories of legal pedestrian actions. Mehmood [18] presented a method for detecting
anomalies in a spatiotemporal environment. An atomic event is defined for a single object in a scene and
includes the object's position, movement, direction, and velocity. Valid occurrences are characterized by an
aggregation of three partitioned atomic events. In crowded scenes, however, moving pedestrians are hard to
interpret, so these strategies are not appropriate.

Multi-scale pedestrians have been detected using a deep learning architecture, in which CNN features are
used to classify pedestrians in highly dense regions. These approaches address scale and illumination
variation. Other recent research groups have concentrated on the motion orientation and speed of moving
pedestrians. Zhang et al. [19] employ the Kanade-Lucas-Tomasi (KLT) approach [9], in which corner points
represent moving pedestrians and the motion information features are clustered in a controlled environment.
Self-history descriptors and neighbouring-object history descriptors have also been used to detect
abnormalities in a scene [10]. Chebli and Khalifa [20] presented a method for identifying the number of
humans in an image without the use of a camera. They exploited foreground relationships as well as an
optical-flow motion pattern, and calculated dynamic energy from optical flow to distinguish between walking
and running activities, as well as crowd exponential distribution patterns.

Other researchers have focused on understanding and modelling crowd behavior [21]-[24]. Several
strategies detect global anomalous activity by modelling the crowd's behavior. Wang and Hou [22] used the
social force model to characterize crowd behavior. The optical flow pattern of moving objects is computed for
pedestrian detection in crowded environments [14], [25]. Pedestrian behavior described by social force has
also been classified using latent Dirichlet allocation (LDA). Minguez et al. [26] use interaction energy
potentials to study social behavior. Zhang et al. [19] used a KLT feature-based pedestrian tracker, in which
motion characteristics are computed from temporally distinct points. Interaction energy potentials are
calculated from the velocities of spatiotemporal interest points to determine whether they will collide soon
[27]. Other research groups, on the other hand, have concentrated on detecting local abnormal activity. In
[11], a quantifiable index describes the global rarity of uncorrelated motions selected from a spatial
context. The index is calculated across numerous channels with varying velocities and directions.

Using the associated saliency map, they were largely able to detect local abnormal behavior. Zaki and
Sayed [28] used motion intensities to create a motion heat map and compared it with local motion
fluctuations. Direkoglu [5] proposed texture feature-based abnormal activity recognition in highly dense
areas; crowd behavior analysis was then performed using moving objects, optical flow patterns, and
directional information. Papathanasopoulou et al. [29] proposed an approach for extracting moving objects
within a cluster of frames in highly dense areas. Although these strategies have been shown to be useful,
they are usually limited to detecting unusual activity in either a local or a global region. We contend that
jointly considering the motion flow pattern, variable object sizes, and interactions between neighboring
objects in a frame can reflect pedestrian activities in a high-density scene, resulting in improved
performance in detecting unusual activity. We propose a technique that addresses these issues by clustering
motion patterns in consecutive frames. We begin by extracting motion features at different scales and
directions from sequences of frames, and then use motion analysis to identify abnormal activities within a
scene.


3. PROPOSED METHODOLOGY 
In this section, we describe a strategy for identifying and localizing abnormal activity in high-density
zones that incorporates the motion component. We detect unusual fast and slow pedestrian movements in the
local and global regions of the scene. Figure 2 depicts the general architecture of the proposed technique.
Each frame is divided into blocks, and motion data is retrieved to generate motion characteristics at the
pixel and block levels. The motion feature extraction process is divided into the following stages:
- First, within a series of frames, the motion characteristics of moving pedestrians are extracted at spatial
plane coordinates, block by block, in different orientations and scales.
- The motion information is then integrated into a single feature matrix that represents both spatial and
temporal characteristics.
- To classify the activity, k-means clustering is applied to each zone to identify the global and local regions.
- The Euclidean distances are computed across the frames. The directional motion patterns provide the distinct
feature values for the detection of abnormal behavior in academic environments.
- Once a frame has been categorized as unusual, we use pixel-level localization to determine the specific
location of the unexpected behavior. The process continues until the video sequence is complete.
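
The last stage, mapping flagged blocks back to pixel coordinates, can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the per-block score grid, its row-major layout, and the names `localize_unusual_blocks`, `scores`, and `block_width` are assumptions introduced here.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1), end-exclusive pixel coordinates


def localize_unusual_blocks(scores: List[List[float]], block_width: int,
                            threshold: float) -> List[Box]:
    """Return the pixel rectangle of every block whose score exceeds the threshold.

    ASSUMPTION: scores[r][c] is a per-block anomaly score (for example, the
    distance of the block's motion feature to its nearest cluster centre),
    stored row by row for square blocks of side block_width.
    """
    boxes: List[Box] = []
    for r, row in enumerate(scores):
        for c, score in enumerate(row):
            if score > threshold:
                x0, y0 = c * block_width, r * block_width
                boxes.append((x0, y0, x0 + block_width, y0 + block_width))
    return boxes


if __name__ == "__main__":
    grid = [[0.1, 0.2, 0.1],
            [0.3, 2.5, 0.2]]
    print(localize_unusual_blocks(grid, block_width=16, threshold=1.0))
```

Returning rectangles instead of a mask keeps the result easy to draw on the frame; a pixel mask could be derived from the same loop if finer localization is needed.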


Figure 2. Proposed framework architecture to represent moving object behavior in academic environment
(pipeline stages: input video, motion information, motion force map, feature extraction, and unusual activity
detection at the frame level)


To detect unusual human activity, the proposed method processes each frame of the video sequence. It first
extracts the optical flows of the pixels within each block of the frame and averages them, as given in (1).

$$B_i = \frac{1}{J}\sum_{j=1}^{J} f_i^j \qquad (1)$$

where $B_i$ denotes the optical flow of the $i$-th block, $J$ is the number of pixels in a block, and
$f_i^j$ is the optical flow of the $j$-th pixel in the $i$-th block. Next, the threshold $T_d$ for the block
is computed from the motion vector $B_i$ and the block width $S$, as given in (2).

$$T_d = B_i \times S \qquad (2)$$

The angle $\theta_{ij}$ between the feature vectors is computed using the Euclidean distance $ED(i,j)$:

$$\theta_{ij} = \begin{cases} \angle(B_i, B_j), & ED(i,j) < T_d \\ 0, & \text{otherwise} \end{cases}$$

The motion feature extraction process is defined by (3).

$$MF = MF(\theta_{B_i}) + \frac{ED(i,j)}{B_i} \qquad (3)$$

where $MF$ is the motion feature map and $ED(i,j)$ is the Euclidean distance between objects $i$ and $j$.
The motion feature extraction procedure is summarized in the algorithm.
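
As an illustration of (1) and (2), the sketch below averages a dense optical flow field block by block and derives a per-block threshold. It is a minimal example under stated assumptions, not the authors' code: the flow field is a nested list of hypothetical `(u, v)` tuples, incomplete border blocks are dropped, and the magnitude of $B_i$ is used in the threshold so that it is a scalar.

```python
import math
from typing import List, Tuple

Vec = Tuple[float, float]  # optical flow (u, v) of one pixel


def block_mean_flow(flow: List[List[Vec]], block_width: int) -> List[List[Vec]]:
    """Average the per-pixel flow inside each S x S block, as in (1).

    Rows or columns that do not fill a complete block are ignored.
    """
    rows, cols = len(flow), len(flow[0])
    blocks: List[List[Vec]] = []
    for y0 in range(0, rows - block_width + 1, block_width):
        block_row: List[Vec] = []
        for x0 in range(0, cols - block_width + 1, block_width):
            pixels = [flow[y][x]
                      for y in range(y0, y0 + block_width)
                      for x in range(x0, x0 + block_width)]
            n = len(pixels)  # J, the number of pixels in the block
            block_row.append((sum(p[0] for p in pixels) / n,
                              sum(p[1] for p in pixels) / n))
        blocks.append(block_row)
    return blocks


def block_threshold(b: Vec, block_width: int) -> float:
    """Threshold of (2). ASSUMPTION: the magnitude of B_i is multiplied by S."""
    return math.hypot(b[0], b[1]) * block_width


if __name__ == "__main__":
    # 2 x 4 toy flow field, split into two 2 x 2 blocks.
    flow = [[(1.0, 0.0), (3.0, 0.0), (0.0, 0.0), (0.0, 0.0)],
            [(2.0, 2.0), (2.0, -2.0), (0.0, 0.0), (0.0, 4.0)]]
    blocks = block_mean_flow(flow, block_width=2)
    print(blocks)
    print([block_threshold(b, 2) for b in blocks[0]])
```

In practice the flow field would come from a dense optical flow routine; the nested-list layout here only keeps the example free of external dependencies.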

In addition, clustering at the frame level defines the motion regions in a frame; the optical flow of
each pixel in a cluster, taken in different directions, is considered as the feature vector. As the distance
between blocks decreases, the probability of unexpected behavior in the corresponding block decreases. If a
larger distance is found, the actions in consecutive frames can be classified as anomalous. Consequently, if
the distance exceeds a constant threshold value, the current frame is recognized as an unusual activity
frame. Next, we describe the experimental results.
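
The frame-level decision can be sketched with a plain k-means step followed by a distance test. The example below is a hedged illustration: it assumes that cluster centres are learned from block features of normal frames, that a frame is flagged when any block feature lies farther than a fixed threshold from its nearest centre, and that the first k points serve as initial centres. The function names are hypothetical.

```python
import math
from typing import List, Sequence

Point = Sequence[float]


def kmeans(points: List[Point], k: int, iterations: int = 20) -> List[List[float]]:
    """Plain Lloyd's k-means; centres start at the first k points (deterministic)."""
    centres = [list(p) for p in points[:k]]
    for _ in range(iterations):
        groups: List[List[Point]] = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centres[c]))
            groups[nearest].append(p)
        for c, members in enumerate(groups):
            if members:  # keep the old centre if a cluster ends up empty
                centres[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return centres


def is_unusual_frame(features: List[Point], centres: List[Point],
                     threshold: float) -> bool:
    """Flag the frame if any block feature is farther than the threshold
    from its nearest cluster centre."""
    return any(min(math.dist(f, c) for c in centres) > threshold
               for f in features)


if __name__ == "__main__":
    normal = [(0, 0), (10, 10), (0, 1), (10, 11), (1, 0), (11, 10)]
    centres = kmeans(normal, k=2)
    print(is_unusual_frame([(0.5, 0.5), (10, 10)], centres, threshold=2.0))
    print(is_unusual_frame([(5, 5)], centres, threshold=2.0))
```

A constant threshold mirrors the description above; choosing its value would normally be done on held-out normal footage.
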

Algorithm: Motion information extraction for detection of unusual human activity from video.

Inputs: V, the input video sequence
        N, the last frame of the video
        S, the block width
        K, the number of blocks in each frame
        f, a frame of the input video sequence
        B, the motion vector
Output: MF, the motion features

Read each frame of the video
for f = 1 to N do
    Process each block of the frame
    for i = 1 to K do
        Compute the threshold for each block
        T_d ← B_i × S
        Process the adjacent blocks in the frame
        for j = 1 to K do
            Compute the centroid from the bounding box points, represented by (cX, cY)
            Append the centroid to the centroid dictionary
            if i ≠ j then
                Compute the distance between the blocks
                ED(i,j) ← EucliDist(B_i, B_j)
                Compare the distance against the threshold
                if ED(i,j) < T_d then
                    Compute the angle θ_ij between B_i and B_j
                    if −θB_i < θ_ij < θB_i then
                        MF ← MF(θB_i) + ED(i,j) / B_i
                    end if
                end if
            end if
        end for
    end for
end for
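
For one frame, the inner loops of the algorithm can be sketched in Python as below. Several details are assumptions made for illustration and are not stated in the listing: the distance is measured between block centroids, MF is stored as a histogram over `n_bins` quantised directions of B_i, the magnitude of B_i is used wherever a scalar is needed, and the hypothetical parameter `half_angle` stands in for the angular bound of the innermost test.

```python
import math
from typing import List, Tuple

Vec = Tuple[float, float]


def motion_features(B: List[Vec], centroids: List[Vec], S: float,
                    half_angle: float = math.pi / 2,
                    n_bins: int = 8) -> List[float]:
    """Accumulate ED(i, j) / |B_i| into direction bins for one frame."""
    MF = [0.0] * n_bins
    two_pi = 2 * math.pi
    for i, b_i in enumerate(B):
        mag_i = math.hypot(*b_i)
        if mag_i == 0.0:
            continue  # a static block contributes nothing
        T_d = mag_i * S  # threshold for block i
        theta_b_i = math.atan2(b_i[1], b_i[0])
        bin_i = int((theta_b_i % two_pi) / two_pi * n_bins) % n_bins
        for j, b_j in enumerate(B):
            if i == j:
                continue
            ed = math.dist(centroids[i], centroids[j])
            if ed >= T_d:
                continue
            # Angle between B_i and B_j, wrapped into (-pi, pi].
            theta_ij = math.atan2(b_j[1], b_j[0]) - theta_b_i
            theta_ij = math.atan2(math.sin(theta_ij), math.cos(theta_ij))
            if -half_angle < theta_ij < half_angle:
                MF[bin_i] += ed / mag_i
    return MF


if __name__ == "__main__":
    B = [(2.0, 0.0), (1.0, 0.0), (0.0, 0.0)]
    centroids = [(8.0, 8.0), (24.0, 8.0), (40.0, 8.0)]
    print(motion_features(B, centroids, S=16))
```

Calling the function once per frame and stacking the returned histograms gives a feature matrix over time, which is one plausible reading of the "single feature matrix" mentioned in Section 3.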


4. RESULTS AND DISCUSSION 

We validated the accuracy of the proposed approach on public datasets, namely the University of Minnesota
anomaly detection datasets, as well as on the proposed student behavior dataset. The experiments and the
proposed framework were carried out using a single NVIDIA graphics processing unit (GPU) and an Intel Core i7
3.4 GHz processor with 32 GB of random-access memory (RAM) and a 32 GB NVIDIA graphics card, all configured
using a CUDA-optimized architecture and the open source computer vision library (OpenCV) deep learning
framework. The proposed method is compared with state-of-the-art unusual activity detection methods [29],
[30], social force models [31], a sparse representation-based method [32], and a mixture of dynamic
textures-based method [32]. True positive (TP), true negative (TN), false positive (FP), false negative (FN),
equal error rate (ERR), true positive rate (TPR), true negative rate (TNR), and area under the curve (AUC)
are common performance metrics. We computed TPR, TNR, AUC, and ERR using (4), (5), (6), and (7) for the
performance comparison.


$$TPR = \frac{TP}{TP + FN} \qquad (4)$$

$$TNR = \frac{TN}{TN + FP} \qquad (5)$$

$$AUC = \frac{1}{2}(TPR + TNR) \qquad (6)$$

$$ERR = 1 - \frac{1}{2}(TPR + TNR) \qquad (7)$$
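
The four metrics can be computed from raw confusion counts as in the short sketch below. It uses the standard definitions of TPR and TNR and assumes that the factor in (6) and (7) is one half, so that AUC is the mean of the two rates and ERR is its complement. The function name `rates` is hypothetical, and both classes are assumed to be non-empty.

```python
from typing import Dict


def rates(tp: int, tn: int, fp: int, fn: int) -> Dict[str, float]:
    """Return TPR, TNR, and the AUC and ERR summaries of (6) and (7)."""
    tpr = tp / (tp + fn)      # share of unusual frames that were detected
    tnr = tn / (tn + fp)      # share of normal frames that were accepted
    auc = 0.5 * (tpr + tnr)   # single operating point estimate, as in (6)
    err = 1.0 - auc           # complement of the AUC, as in (7)
    return {"TPR": tpr, "TNR": tnr, "AUC": auc, "ERR": err}


if __name__ == "__main__":
    print(rates(tp=3, tn=1, fp=1, fn=1))
```

With 3 true positives, 1 false negative, 1 true negative, and 1 false positive, the rates are 0.75 and 0.5, so the AUC estimate is 0.625 and the ERR is 0.375.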


First, we performed experiments on the University of Minnesota anomaly detection dataset. Figure 3
shows the receiver operating characteristic (ROC) curve for the proposed approach and the existing approaches
presented in [31], [32]. Similarly, Figure 4 shows the ROC curve for the proposed student behavior dataset.
The ROC curves indicate that the proposed framework outperforms the methods available in the literature. For
quantitative comparison, we computed the ERR of the existing and proposed methods, as shown in Table 1 and
Table 2 for the two datasets. The proposed method gives a lower error rate on both datasets, i.e., 16.1% and
18.1%, respectively. The AUC of the existing and proposed methods is shown in Table 3 and Table 4 for the two
datasets. The AUC of the proposed method is 73.2% and 72.1%, respectively. These results show that the
proposed motion pattern-based approach is efficient, robust, and accurate on both datasets and performs
comparatively better than existing approaches to suspicious behavior detection.

Figure 3. FPR and TPR for the existing and proposed motion feature map-based method for University of
Minnesota anomaly detection dataset

Figure 4. FPR and TPR for the existing and proposed motion feature map-based method for proposed student
behavior dataset


Table 1. ERR for the existing and proposed motion feature map-based method on University of Minnesota
anomaly detection dataset

Methodology                             Ped. 1    Ped. 2    Avg.
Social Force Model [32]                 36.5%     35.0%     35.7%
Sparse Representation [31]              35.6%     35.8%     35.7%
Mixture of Dynamic Texture [32]         22.9%     22.9%     22.9%
Proposed Motion Feature based method    21.1%     18.1%     16.1%


Table 2. ERR for the existing and proposed motion feature map-based method on student behavior dataset

Methodology                             Ped. 1    Ped. 2    Avg.
Social Force Model [32]                 37.5%     34.0%     36.7%
Sparse Representation [31]              32.6%     35.8%     33.7%
Mixture of Dynamic Texture [32]         24.9%     23.9%     23.5%
Proposed Motion Feature based method    22.1%     19.2%     18.1%


Table 3. AUC for the existing and proposed motion feature map-based method on University of Minnesota
anomaly detection dataset

Methodology                             Ped. 1    Ped. 2    Avg.
Social Force Model [32]                 40.9%     27.6%     34.2%
Sparse Representation [31]              32.6%     22.4%     27.5%
Mixture of Dynamic Texture [32]         59.3%     56.8%     58.0%
Proposed Motion Feature based method    64.9%     81.5%     73.2%


Table 4. AUC for the existing and proposed motion feature map-based method on proposed student behavior
dataset

Methodology                             Ped. 1    Ped. 2    Avg.
Social Force Model [32]                 50.8%     63.4%     57.1%
Sparse Representation [31]              74.5%     70.1%     72.3%
Mixture of Dynamic Texture [32]         35.6%     35.8%     35.7%
Proposed Motion Feature based method    63.4%     80.2%     72.1%


5. CONCLUSION 
In this paper, we have proposed a novel method to detect unusual human activities in an academic
environment. Using the spatial and temporal characteristics of motion features, we can classify frames as
containing normal or abnormal pedestrian activity and locate regions of abnormal activity within the frame as
local or global regions. We conducted experiments on the University of Minnesota anomaly detection datasets
and the proposed student behavioral dataset. The proposed method was found to be effective, surpassing other
competing methods in the literature. The purpose of this research is to detect abnormal actions in an
academic environment, for which cameras generally cover a large area. In future, the same method can be
applied to other scenarios of student behavior, such as cheating in examinations and student disputes on
campus. Scale, rotation, and illumination changes can also be addressed if the proposed approach is enhanced
with additional scale-, rotation-, and illumination-invariant features.